Fault-tolerant logical operations for qubits encoded by CSS codes are discussed, with emphasis
on methods that apply to codes of high rate, encoding k qubits per block with k > 1. It is shown 
that the logical qubits within a given block can be prepared by a single recovery operation in any 
state whose stabilizer generator separates into X and Z parts. Optimized methods to move logical 
qubits around and to achieve controlled-not and Toffoli gates are discussed. It is found that the 
number of time-steps required to complete a fault-tolerant quantum computation is the same when
k > 1 as when k = 1.



I. INTRODUCTION 

Fault tolerant quantum computation is quantum com- 
putation of high fidelity carried out with physical qubits 
and operations that are noisy and imperfect. 'Fault tol- 
erance' covers a variety of concepts, but there are three 
main ones: (generalized) geometric or adiabatic phases, 
composite pulses, and quantum error correction (QEC). 
This paper is concerned purely with the last of these.

The main ideas for fault-tolerant universal quantum
computation on encoded states were introduced by Shor [1].
Two aspects have to be considered: the error correction
or recovery process, which uses a noisy quantum
network, and the implementation of quantum gates to
evolve the logical state of the machine. This paper is
concerned purely with the latter task, but we will study
methods in which the two aspects are to some extent
merged.

The present work builds on a series of ideas that were
established as follows. Shor's seminal work [1] discussed
CSS codes encoding a single qubit per block. It established
such central concepts as the use of ancillary entangled
states that are partially verified, repetition of
syndrome measurements, and a discrete universal set of
logical operations. DiVincenzo and Shor [2] generalised
the fault-tolerant syndrome measurement protocol to any
stabilizer code, and Steane [3] discovered the more efficient
technique of using prepared logical zero states to
extract syndromes, which will be adopted in this paper.

Gottesman [4] discovered fault-tolerant universal
methods that can be applied to all stabilizer codes. The
main new ingredient is to use measurements of observables
in the Pauli group, combined with preparation of
'cat' states, to achieve desired operations. Teleportation
in particular is used to extract an individual logical bit
from one block and place it in another. Steane [5] showed
that the measurements of Pauli observables required in
Gottesman's methods can be absorbed into the syndrome
measurement, so that they are achieved at close to zero
cost.

The important concept of 'teleporting a gate' or teleporting
qubits 'through' a gate was introduced by Nielsen
and Chuang [6] and applied to fault-tolerant gate constructions
by Gottesman and Chuang [7].

In this paper we study methods for quantum codes 
encoding more than one qubit per block. We introduce 
extensions and generalisations of the ideas just listed, 
and identify networks requiring the least computation 
resources to perform a given operation. One interesting 
result is that the number of time steps required to com- 
plete a logical algorithm is the same for k = 1 and k > 1, 
where k is the number of logical qubits per block. This is
because the methods allow much of the required processing
to take place "off-line", without interrupting the evolution
of the computer. The "off-line" operations involve
the preparation of ancillary qubits in specific states, and
the transfer of logical qubits to otherwise empty blocks
by teleportation.

The paper is organised as follows. Section II introduces
terminology and notation. Section III lists some ways to
achieve a universal set of fault-tolerant operations. Section IV
then presents our first main result (theorem 1 and
its corollary). This is an extension of a theorem in [5];
it shows that CSS-encoded qubits can be fault-tolerantly
prepared in a useful class of states by use of a single
recovery operation. We also discuss how to simplify some
more general state preparations by decomposing stabilizer
operators into simpler components.

Section V gives a set of basic operations for CSS codes.
The main aim is to discuss the transfer and teleportation
operations whose use for manipulating bits encoded
by stabilizer codes was proposed by Gottesman [4]. We
list the constructions and present the most efficient
implementation of teleportation between blocks. We use
theorem 1 to avoid the need to prepare 'cat' states for
preparing and measuring states, including states in the
Bell basis of encoded qubits.

Sections VI and VII discuss implementation of the
controlled-not and Toffoli gates respectively, between
qubits encoded in the same block.

II. TERMINOLOGY AND NOTATION 

The following notation will be adopted. The single-qubit
operators $X$, $Y$ and $Z$ are the Pauli operators $\sigma_x$,
$\sigma_y$ and $\sigma_z$, respectively (it will be convenient to define
$Y$ so that it is Hermitian, not real as is sometimes chosen
in QEC discussions). We use $H$ for the single-qubit
Hadamard operation and $S$ for the rotation about the $z$
axis through $\pi/2$ (phase shift of $|1\rangle$ by $i$). Thus $S^2 = Z$
and $(HSH)^2 = X$. The general phase shift of $|1\rangle$ by
$\exp(i\phi)$ will be written $P(\phi)$, so $S = P(\pi/2)$, $Z = P(\pi)$,
etc.
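The identities quoted above are easy to confirm numerically. The following is a minimal sketch using explicit 2 x 2 matrices; the function name `P` mirrors the text, while the dictionary `identities` is only an illustrative container and not part of the paper.

```python
import numpy as np

# Single-qubit operators in the computational basis {|0>, |1>}.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)  # Hermitian convention for Y
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def P(phi):
    """Phase shift of |1> by exp(i*phi)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]], dtype=complex)


S = P(np.pi / 2)

# Each entry records whether the named identity holds numerically.
identities = {
    "S^2 = Z": np.allclose(S @ S, Z),
    "(HSH)^2 = X": np.allclose((H @ S @ H) @ (H @ S @ H), X),
    "Z = P(pi)": np.allclose(P(np.pi), Z),
    "Y is Hermitian": np.allclose(Y, Y.conj().T),
}
```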

A controlled $U$ operation is written ${}^{C}U$, so for example
${}^{C}X$ is controlled-not, and $T = {}^{CC}X$ is the Toffoli gate.

The logic gate hierarchy introduced in [7] is defined
recursively by

$$C_j \equiv \{ U \mid U C_1 U^\dagger \subseteq C_{j-1} \}, \qquad (1)$$

where $C_1$ is the Pauli group (the set of tensor products of
Pauli operators, including the identity $I$ and $iI$). Each $C_j$
contains $C_{j-1}$. $P(\pi/2^j) \in C_{j+1} \setminus C_j$, where $\setminus$ denotes the
set difference. The Clifford group is $C_2$ in the hierarchy
[7]. By the definition of $C_2$, this group is the normalizer
of the Pauli group. It is generated by $\{H, S, {}^{C}X\}$.
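The recursive definition (1) can be tested directly for small operators. The sketch below is an illustrative brute-force check, not a method from the paper: the helper names `pauli_products` and `in_level` are hypothetical, and $C_1$ is taken to be Pauli products up to a phase $\pm 1$, $\pm i$.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j]).astype(complex)
P_PI_4 = np.diag([1, np.exp(1j * np.pi / 4)])
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def pauli_products(n):
    """All 4**n tensor products of single-qubit Pauli operators (phases omitted)."""
    products = []
    for combo in itertools.product([I2, X, Y, Z], repeat=n):
        op = combo[0]
        for factor in combo[1:]:
            op = np.kron(op, factor)
        products.append(op)
    return products


def in_level(U, j, n):
    """True if the n-qubit unitary U lies in level C_j of the hierarchy."""
    paulis = pauli_products(n)
    if j == 1:
        # C_1: a Pauli product up to a phase 1, -1, i or -i.
        return any(np.allclose(U, ph * p)
                   for p in paulis for ph in (1, -1, 1j, -1j))
    # C_j: conjugating every Pauli product by U must land in C_{j-1}.
    return all(in_level(U @ p @ U.conj().T, j - 1, n) for p in paulis)
```

With these definitions, `in_level(H, 2, 1)`, `in_level(S, 2, 1)` and `in_level(CX, 2, 2)` hold while the same operators fail the `j = 1` test, and `P_PI_4` passes only at `j = 3`, in line with $P(\pi/2^j) \in C_{j+1} \setminus C_j$.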

All operators are understood to act on the logical, i.e. 
encoded qubits (operations on the physical qubits are dis- 
cussed in the appendix) . A blockwise operation is defined 
to be one such that the relevant operator acts on each of 
the logical qubits in a given block, or each corresponding 
pair in two blocks in the case of 2-qubit operators (block- 
wise action of 3- or more-bit operators will not arise in 
the discussion). 

We define an operation to be 'fault tolerant' if it does 
not cause errors in one physical qubit to propagate to 
two or more qubits in any one block. The fault tolerance 
of the operations used in the networks to be discussed is 
proved in the appendix. 

A block of $n$ physical qubits stores $k$ logical qubits.
The notation $M_u$, where $u$ is a $k$-bit binary word, means
a tensor product of single-qubit $M$ operators acting on
those logical qubits identified by the 1s in $u$ (for example
$X_{101} = X \otimes I \otimes X$). The letters $u, v, w, x, y, z$ when
used as a subscript or inside a ket symbol (as in $|x\rangle_L$)
always refer to binary words. When we wish to treat a
list of operators such as $\{M_i,\ i = 1 \ldots k\}$ then the letters
$i, j, r, p$ are used as subscripts.

The notation $X_i = X_{2^{k-i}}$ or $Z_i = Z_{2^{k-i}}$, where $i$ is
a number running from 1 to $k$, means a single operator
applied to the $i$'th logical bit in a block. For example
$X_2 = X_{01000}$ for $k = 5$; N.B. no powers (greater than 1)
of Pauli operators appear anywhere in this paper.
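The subscript conventions can be made concrete with a short sketch. The helper names `op_on_word` and `word_for_index` are hypothetical and chosen only for this illustration; binary words are represented as strings, with the first character referring to the first logical qubit.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def op_on_word(M, u):
    """Return M_u: M on the logical qubits marked by 1s in the word u, identity elsewhere."""
    factors = [M if bit == "1" else I2 for bit in u]
    return reduce(np.kron, factors)


def word_for_index(i, k):
    """Binary word selecting the i'th of k logical qubits (i runs from 1 to k).

    This is the k-bit representation of the number 2**(k - i).
    """
    return format(2 ** (k - i), "0{}b".format(k))
```

For instance, `word_for_index(2, 5)` gives the word `01000`, so `op_on_word(X, word_for_index(2, 5))` is the operator written $X_2$ in the text for $k = 5$.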

A. Computational resources 

Most of the computational resources of the physical
computer are dedicated to the QEC networks. The complete
network to recover (= error-correct) a single block
involves $\sim n d^2$ physical gates, where $d$ is the minimum
distance of the code, whereas the operations acting
in between recoveries of a given block typically only involve
$n$ physical operators (one for each physical bit in
the block). To assess the resources of the networks to be
described we will therefore primarily count blocks and
recoveries.

Whenever a single block is recovered, all are, because 
the duration of the recovery network is assumed to be 
long enough that even 'resting' blocks accumulate significant
memory errors. We allow at most one set of gates
connecting different blocks between successive recoveries,
to prevent avalanches of errors. However, we allow
combinations of twin- and single-block operations, such
as ${}^{C}X$ followed by $H$, without requiring a further recovery.
We define one 'time step' to be the interval between
the completion of one recovery, and the completion of the 
next. The 'area' of a network is defined to be the product 
(number of blocks) x (number of time steps). 

Measurement of logical bits, and preparation of logical
bits in required states, is absorbed as much as possible
into the recovery operations as described in section IV.

Most of the operations on the computer are either measurements
absorbed into recoveries or a physical gate applied
once to each bit in a block or pair of blocks (so-called
'transversal' application of a gate). We will treat
in this paper the case where the QEC encoding is a CSS
code based on a doubly even classical code, such that
fault-tolerant Clifford group gates are relatively straightforward
(see section V) but the members of $C_3$ (including
the Toffoli gate and ${}^{C}S$, $P(\pi/4)$) are not. To implement
the latter, we adopt Shor's method of preparing a block
of $n$ physical bits in the 'cat' state $|0^{\otimes n}\rangle + |1^{\otimes n}\rangle$ and using
it to measure Clifford group observables such as blockwise
${}^{C}X$ on encoded bits. This method is fault-tolerant,
but it is an undesirable element because the noise associated
with preparing the cat states and connecting them
to the data qubits is larger than that of a single transversal
gate. Therefore we will aim to keep the use of such
cat states to a minimum.

We distinguish between 'offline' and 'online' parts of 
the networks to be discussed. The 'online' parts are so 
called because they involve operations on the logical data 
qubits of the computer, and therefore can only take place 
at the correct moment in the algorithm being computed. 
The 'offline' parts are state preparations which can take 
place at any time prior to when they are needed, and 
operations to move passive qubits (i.e. those not im- 
mediately involved in a logical gate) around in order to 
conserve memory blocks. The offline parts can proceed in 
parallel with other operations of the computer as long as 
there are sufficient spare blocks available, but the com- 
puter's algorithm cannot be evolved further while the 
online part of a given step is completed, because the al- 
gorithm (in all but rare instances) requires the logical 
operations to take place sequentially. This means that 
when considering the computation resources required for 
a given network, the most important cost measure is the 
duration of the online part. 

In the methods to be discussed, it often happens that 
data qubits are moved from one block to another in order 
to make it possible to apply logical operations to them. 
At any given moment, most blocks in the computer act
as memory, and a few act as an 'accumulator' where the
logical operations take place. The movement of memory
qubits to and from the accumulator is intermediate between
'offline' and 'online'. For, suppose a data bit has
been moved to an accumulator block and a logical op- 
eration has just been applied to it. In order to free the 
accumulator for further use, the bit must be moved out 
again. If this bit were required in the next logical oper- 
ation, however, then it is usually possible to apply the 
logical operation straight away, and move it afterwards. 
If the bit were not required, then the operation to move it 
back into memory could proceed offline, as long as there 
is another accumulator block available to allow the next 
logical gate to proceed at the same time. Therefore we 
will count each operation to move qubits from memory 
to accumulator as online, and operations to move them 
back to memory as offline. 



III. UNIVERSAL SETS 

In this section we will consider universal sets of quan- 
tum gates for which fault-tolerant constructions have 
been put forward. 

For operations on bare qubits, the most commonly considered
universal set of quantum gates is $\{U(\theta, \phi), {}^{C}X\}$
where $U(\theta, \phi)$ is a rotation of a single qubit through $\theta$
about an axis in the $x$-$y$ plane specified by $\phi$. However,
this is not a useful set to consider for the purpose
of finding fault-tolerant gates on encoded qubits, because
$U(\theta, \phi)$ is not readily amenable to fault-tolerant methods.

Several different proposals for fault-tolerant universal
sets have been put forward. All involve the Clifford
group. The Clifford group is not sufficient for universal
quantum computation, nor even for useful quantum
computation, since it can be shown that a quantum computer
using only operations from the Clifford group can
be efficiently simulated on a classical computer [10, 11].
To complete the set a further operator must be added,
and it can be shown that an operator in $C_3 \setminus C_2$
suffices.

1. Shor [1] proposed adding the Toffoli gate, making
the universal set $\{H, S, {}^{C}X, T\}$ (or $\{R, S, {}^{C}X, T\}$
which is equivalent since $R = HS^2$). Obviously, ${}^{C}X$
can be obtained from $T$, but this does not reduce
the set since Shor's method to obtain $T$ assumes
that ${}^{C}X$ is already available.

2. $\{H, S, {}^{C}X, {}^{C}S\}$ was considered for example by Knill,
Laflamme and Zurek [12]. This is similar to (1)
because ${}^{C}S$ and ${}^{C}X$ suffice to produce ${}^{CC}Z$, which
with $H$ makes ${}^{CC}X = T$.

3. The same authors [12] also considered $\{S, {}^{C}X, {}^{C}S\}$
together with the ability to prepare the encoded (or
'logical') states $|+\rangle_L = (|0\rangle_L + |1\rangle_L)/\sqrt{2}$,
$|-\rangle_L = (|0\rangle_L - |1\rangle_L)/\sqrt{2}$. This can be shown to be sufficient
since preparation of $|\pm\rangle_L$ together with $S$ and
${}^{C}X$ can produce $H$, and the rest follows as in (2).

4. $\{H, S, {}^{C}X, P(\pi/4)\}$ is the 'standard set' discussed
by Nielsen and Chuang [11].

5. Knill et al. proposed $\{H, S, {}^{C}X\}$ combined
with preparation of $|\pi/8\rangle_L = \cos(\pi/8)\,|0\rangle_L +
\sin(\pi/8)\,|1\rangle_L$. The latter is prepared by making use
of the fact that it is an eigenstate of $H$, and once
prepared is used to obtain a ${}^{C}H$ operation, from
which the Toffoli gate can be obtained.

6. Gottesman [4] showed that ${}^{C}X$, combined with the
ability to measure $X$, $Y$ and $Z$, is sufficient to produce
any operation in $C_2$. The universal set is completed
by an operation in $C_3 \setminus C_2$ such as $T$.

7. Shi proved that $\{H, T\}$ is universal; some further
insights are given by Aharonov [14].

Many of these methods are summarized and explained 
in , where the proof of universality and the efficiency 
of approximating a continuous set with a discrete one 
(Solovay-Kitaev theorem) is also discussed. 

(1) is a useful starting point and we will use it in this 
paper, but generalized to [[n, k, d]] codes storing more 
than one qubit per block. Similar methods apply to (2) 
and (4). A generalization of the ideas of Knill et al. used 
for (2) is given in the appendix; however, the codes for 
which it works turn out to be non-optimal. (5) will not
be adopted because it is slow, requiring 12 preparations
of $|\pi/8\rangle_L$ for every Toffoli gate, and the preparation is
itself non-trivial. (6) is important because measurement
of $X$, $Y$ and $Z$ can be performed fault-tolerantly for any
stabilizer code, not just $[[n, 1, d]]$ codes. Gottesman also
proposed the use of measurements and whole-block oper- 
ations to swap logical qubits between and within blocks. 
(7) is a nice result, but the known fault-tolerant constructions
for $T$ assume that fault-tolerant versions of other
gates such as ${}^{C}X$ are already available, so this 'minimal'
set has not so far been used to generate fault-tolerant
universal computation.

The Gottesman methods rely heavily on measurement,
which might be thought to be disadvantageous. In fact,
since the measurements can be absorbed into the recoveries
(see section IV and [5]) they are available at no
cost and therefore are advantageous. In any case all the
methods involve measurement and/or state preparation 
to implement the Toffoli or an equivalent gate. Since any 
useful quantum computation must make significant use 
of gates outside the Clifford group (otherwise it could 
be efficiently simulated classically), the methods are all 
roughly equivalent in this regard. For example, the speed 
of Shor's algorithm to factorize integers is limited by the
Toffoli gates required to evaluate modular exponentials.

IV. MEASUREMENT OF LOGICAL PAULI
OBSERVABLES

Theorem 1. For any CSS code, measurement
of a set $\mathcal{M}$ of logical observables in the
Pauli group can be performed at almost no
cost by merging it with a single recovery operation,
as long as the set has the following
properties: every $M \in \mathcal{M}$ is of the form either
$X_u$ or $Y_u$ or $Z_u$ (i.e. a product of one type
of Pauli operator), and not all three types of
operator appear in the set.

Theorem 1 was put forward in [5] for the case of measuring
a single observable of the form $X_u$, $Y_u$ or $Z_u$. The
method is to prepare an ancilla in $|a_u\rangle = |0\rangle_L + |u\rangle_L$, then
operate blockwise ${}^{C}X$ or ${}^{C}Y$ or ${}^{C}Z$ from ancilla to data,
then measure the ancilla in the $\{|+\rangle, |-\rangle\}$ basis. The
measurement outcomes permit both an error syndrome
and the eigenvalue of the relevant observable to be deduced.
The ancilla preparation is done fault-tolerantly.
One fault-tolerant method is to produce an imperfect version
of the desired state $|a_u\rangle$ by any means, and then
to measure all those observables in the stabilizer of $|a_u\rangle$
that consist of only $Z$ operators; the prepared state is
rejected if any of these verifying measurements yields the
wrong eigenvalue ($-1$), and in such cases a further preparation
attempt is initiated. Any prepared ancilla state
that passes the verification does not have correlated $X$
errors in it, so can safely act as the control bits in a
blockwise controlled gate with the data. $Z$ errors in the
ancilla preparation (whether correlated or not) cause the
wrong syndrome and/or wrong eigenvalue of the observable
being measured on the data to be deduced. This
is guarded against by repetition and taking a majority
vote. This vote corrects the effects of $Z$ errors in the
ancilla preparation; it explains why it was not necessary
to measure the X-type stabilizer observables in the verification
step. The whole procedure is fault-tolerant if the
noise is uncorrelated and stochastic. It is efficient if the
initial preparation attempt has a non-negligible probability
of success (i.e. of producing $|a_u\rangle$ with no $X$, $Y$ or
$Z$ errors).

In the method just outlined, only a subset of the observables
in the stabilizer of $|a_u\rangle$ was measured in order
to verify the ancilla. Other methods are possible. For example
a measurement of the complete set of observables,
combined with rotations conditional on the outcomes, is
one way to prepare $|a_u\rangle$. Further copies could be produced
and then compared by controlled-not.

To generalize to the complete result presented in the
theorem, consider first a set of observables of a single type
$\{M_u\}$ where $M$ is either $X$ or $Y$ or $Z$. A measurement
of any pair $M_u$, $M_v$ is equivalent, both in the eigenvalue
information obtained, and in the state projection which
results, to measuring all members of the closed Abelian
group $\{I, M_u, M_v, M_u M_v = M_{u+v}\}$. Similarly, measuring
the whole set is equivalent to measuring an Abelian
group, and the corresponding binary vectors $\{u\}$ form a
linear vector space. The ancilla is prepared in

$$|a_{\mathcal{M}}\rangle = \sum_{u} |u\rangle_L, \qquad (2)$$

where the sum runs over that vector space,
and the rest of the method proceeds as before.

When the set $\mathcal{M}$ to be measured contains members of
two different types, the members of each type are measured
during the corresponding part of the syndrome extraction;
the syndrome extraction proceeds in two parts for CSS
codes. These are normally envisaged to collect X-error
and then Z-error syndromes, but we are free to choose
any one out of the three pairs {X, Z}, {X, Y}, {Y, Z} to 
get the complete syndrome information. Each is obtained 
by operating the relevant type of controlled gate from an- 
cilla to data, so we can simultaneously measure the same 
combinations of observable types. We cannot measure 
single observables of mixed type because we only have 
blockwise controlled-gates of un-mixed type available. 

A. Logical state preparation 

Next we address preparation of logical states. In order 
to introduce notation, let us list the simplest measure- 
ments that theorem 1 permits, namely measurement of 
X, Z or Y on any single qubit in a block. These are 
indicated thus: 



Each group of lines in such a diagram represents the logi- 
cal qubits of a given block — by showing more than one we 
indicate that the operation can act on a single bit within 
the block. The dotted box indicates that the group of 
operations take place in a single step. 

Now, the measurement procedure is such as to leave 
the encoded block in an eigenstate of the measured ob- 
servable, in the logical Hilbert space. Furthermore, it is 
shown in the appendix that we can also apply Pauli op- 
erators to individual qubits, and groups of qubits, within 
a block. It follows that we can prepare any logical qubit 
in the eigenstate of eigenvalue +1 of any Pauli operator 
(by a measurement followed by application of an anticommuting
Pauli operator when the measured eigenvalue
is $-1$). This gives the following set of basic fault-tolerant
state preparations:

[Diagram: basic fault-tolerant state preparations]

where $|\pm\rangle = |0\rangle \pm |1\rangle$, $|\pm i\rangle = |0\rangle \pm i\,|1\rangle$.

Measurements can be useful for preparing logical 
qubits not only in the standard states just listed, but 
also in entangled states. The class of logical states which 
can be prepared by the method described is a fairly large 
and powerful class: 

Corollary to theorem 1. Any set of logical qubits
within a given block can be prepared in a quantum codeword
state of any quantum stabilizer code whose stabilizer
separates into pure-X and pure-Z parts, using a single
recovery.

Note, the logical qubits remain encoded in their orig- 
inal 'inner' code; the corollary describes the preparation 
of certain superpositions of logical states. The corollary 
follows immediately from the remarks above: the recov- 
eries are used to measure the stabilizers of the outer code, 
which have the right form when the stabilizer separates 
as stated. The operator to move from a —1 to a +1 eigen- 
state is a tensor product of Pauli operators and so is also 
available. 

For example, the Bell state $|00\rangle_L + |11\rangle_L$ is a quantum
codeword of a $[[2, 0, 2]]$ CSS code with stabilizer $XX$, $ZZ$.
The corollary allows us to prepare such states of pairs of 
logical qubits in the same block; this is very useful for 
teleportation. The following diagrams record this fact 
and give a slightly more complicated example, which we 
will use later and which further illustrates the method: 



[Diagrams: Bell-state preparation of a pair of logical qubits in one block, and the second, four-qubit example] (3)



The first example is used in all the constructions presented
in the rest of this paper.
The stabilizer for the 2nd example is generated by
$X_{1110}$, $X_{0101}$, $Z_{0111}$, $Z_{1010}$. For this case the ancilla
used to extract the syndrome for Z errors is prepared
in $|0000\rangle_L + |1110\rangle_L + |0101\rangle_L + |1011\rangle_L$; the ancilla
used to extract the syndrome for X errors is prepared
in $|0000\rangle_L + |0111\rangle_L + |1010\rangle_L + |1101\rangle_L$.
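The two ancilla states of this example can be checked with a few lines of arithmetic over GF(2). The sketch below is illustrative only: `span_gf2` and `commute` are hypothetical helper names, and binary words are held as strings. It enumerates the linear span of each pair of generator words and confirms that every X-type generator commutes with every Z-type generator.

```python
from itertools import product


def span_gf2(generators):
    """All GF(2) linear combinations of the given binary words (strings of equal length)."""
    k = len(generators[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(generators)):
        total = 0
        for c, g in zip(coeffs, generators):
            if c:
                total ^= int(g, 2)  # bitwise XOR is addition over GF(2)
        words.add(format(total, "0{}b".format(k)))
    return sorted(words)


def commute(x_word, z_word):
    """X_u and Z_v commute iff u and v overlap in an even number of positions."""
    return bin(int(x_word, 2) & int(z_word, 2)).count("1") % 2 == 0


x_generators = ["1110", "0101"]  # subscripts of the X-type stabilizer generators
z_generators = ["0111", "1010"]  # subscripts of the Z-type stabilizer generators

# Words appearing in the ancilla used for the Z-error and X-error syndromes respectively.
z_syndrome_ancilla = span_gf2(x_generators)
x_syndrome_ancilla = span_gf2(z_generators)
all_commute = all(commute(x, z) for x in x_generators for z in z_generators)
```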

For the sake of clarity, let us examine the ancilla preparation
in a little more detail, by using preparation of
$|000\rangle_L + |110\rangle_L$ in the ancilla as an example. Let $G_0$ and
$H_0$ be the generator and check matrices of the classical
code $C_0$ which forms the zeroth quantum codeword (see
equation (18)). $G_0$ is $(n-k)/2 \times n$; $H_0$ is $(n+k)/2 \times n$.

The state $|000\rangle_L$ may be prepared using a network obtained
directly from $G_0$ [18]. To prepare $|000\rangle_L + |110\rangle_L$
it suffices to add the single row $(110)D$ to $G_0$ and use
the resulting matrix to construct the generator network
(c.f. equation (20); the expression $(110)D$ is a product
of a row vector $(110)$ with a $3 \times n$ matrix $D$).

Next we need to verify the state against X errors. The
stabilizer of $|000\rangle_L + |110\rangle_L$ has a Z part consisting of $H_0$
with one row removed, and an X part consisting of $G_0$
plus the extra row $(110)D$ (since $X_{110}(|000\rangle_L + |110\rangle_L) =
|000\rangle_L + |110\rangle_L$). The verification only measures the Z
part of the stabilizer. To identify the correct row of $H_0$
to remove, note that $H_0$ consists of the Z part of the
quantum code stabilizer, which has $(n-k)/2$ rows and is
the same as $G_0$, plus $k$ further rows which are the logical
$Z$ operators. The desired state is stabilized by $Z_{110}$ but
not by $Z_{100}$ or $Z_{010}$. Therefore we replace the two rows
$Z_{100}$ and $Z_{010}$ in $H_0$ by the single row $Z_{110}$.



A useful further insight is provided by considering the
quantity of information obtained by the adapted syndrome
extraction. This can be seen from a simple counting
argument, as follows. A single quantum codeword
such as $|0\rangle_L$ in a CSS code is an equal superposition
of $2^{\kappa}$ product states in the computational basis, where
$\kappa = (n-k)/2$ is the size of the classical code $C_0$
(equation (19)). The Hadamard transformed state is then an
equal superposition of $2^{n-\kappa}$ product states. When we
are using such a state to extract an error syndrome, for
a zero syndrome we expect to observe one of these $2^{n-\kappa}$
states. Correctable errors will transform the state onto
an orthogonal one. There is a total of $\kappa$ bits of remaining
room in Hilbert space for mutually orthogonal subspaces,
so the measurement yields $\kappa$ bits of information;
this is the error syndrome (for either X or Y or Z errors).
If instead the state was originally prepared in $|0\rangle_L + |u\rangle_L$,
then it consisted of an equal superposition of $2^{\kappa+1}$ product
states. Upon being Hadamard transformed, it becomes
an equal superposition of $2^{n-(\kappa+1)}$ states, hence
there are $\kappa + 1$ bits of information about what has happened
to it available from measurements on it. These are
the error syndrome and the eigenvalue of the measured
observable, which are commuting observables so can be
simultaneously measured. The argument extends in an
obvious manner when further mutually commuting observables
are measured.
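The counting argument can be written out as a small calculation. This is a sketch under stated assumptions: the function name `syndrome_information` is hypothetical, and the parameter `extra_observables` assumes that each additional independent commuting observable contributes one further bit, which is how the "obvious" extension is read here. The values $n = 7$, $k = 1$ in the usage note are chosen purely for illustration.

```python
def syndrome_information(n, k, extra_observables=0):
    """Count terms in a CSS ancilla state and the bits learned by measuring it.

    Returns a tuple (terms_before, terms_after, info_bits):
      terms_before: product states in the prepared ancilla, 2**(kappa + m)
      terms_after:  product states after the Hadamard transform, 2**(n - kappa - m)
      info_bits:    bits obtained from the measurement, kappa + m
    where kappa = (n - k) / 2 and m = extra_observables.
    """
    if (n - k) % 2 != 0:
        raise ValueError("n - k must be even for kappa = (n - k)/2 to be an integer")
    kappa = (n - k) // 2
    m = extra_observables
    return 2 ** (kappa + m), 2 ** (n - kappa - m), kappa + m
```

For $n = 7$, $k = 1$ this gives `(8, 16, 3)` for the plain logical zero state and `(16, 8, 4)` when one logical observable is measured along with the syndrome.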



B. More general state preparations 

The available tools for state preparation can be extended
as follows. We wish to prepare a state $|\phi\rangle_L$
of $k$ logical qubits that is uniquely specified by a set
$\{M_i\}$, $(i = 1 \cdots k)$ of $k$ linearly independent commuting
observables; this set generates the stabilizer of $|\phi\rangle_L$ in the
logical Hilbert space. If $|\phi\rangle_L = G\,|0^{\otimes k}\rangle$, then one possible
choice of the stabilizer operators is [19] $M_i = G Z_i G^\dagger$.
Define $Q_i = G X_i G^\dagger$; then each $Q_i$ anticommutes with
its associated stabilizer operator and commutes with all
the others: $M_i Q_i = -Q_i M_i$ and $[M_i, Q_{j \neq i}] = 0$. The $M_i$
and the $Q_i$ all have eigenvalues $\pm 1$.
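As a concrete instance of these definitions, the sketch below takes $k = 2$ and chooses $G$ to be a Hadamard on the first qubit followed by a controlled-not, so that $G\,|00\rangle$ is the Bell state. This choice of $G$ is an assumption made for illustration, not one singled out at this point in the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
# Controlled-not with the first qubit as control, second as target.
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

G = CX @ np.kron(H, I2)          # maps |00> to (|00> + |11>)/sqrt(2)
Gd = G.conj().T

Z_ops = [np.kron(Z, I2), np.kron(I2, Z)]   # Z_1, Z_2
X_ops = [np.kron(X, I2), np.kron(I2, X)]   # X_1, X_2

M = [G @ Zi @ Gd for Zi in Z_ops]          # stabilizer operators M_i = G Z_i G^dagger
Q = [G @ Xi @ Gd for Xi in X_ops]          # partners Q_i = G X_i G^dagger

phi = G @ np.array([1, 0, 0, 0], dtype=complex)
```

Here `M` works out to $XX$ and $ZZ$, the stabilizer of the Bell state, and `Q` to $ZI$ and $IX$; each `Q[i]` anticommutes with `M[i]` and commutes with the other stabilizer operator.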

One method to prepare $|\phi\rangle_L$ is to measure all the $M_i$ on
some arbitrary input state in the code space, and whenever
an eigenvalue $-1$ is found, apply the operator $Q_i$
that moves the $-1$ eigenstate to the $+1$ eigenstate. However,
it may not be straightforward to measure one or
more of the $M_i$ fault-tolerantly.

Let $M_r$ be a stabilizer operator whose fault-tolerant
measurement is not straightforward. Decompose it as
$M_r = N_{r,1} \otimes N_{r,2} \cdots \otimes N_{r,p}$ where there exists a state
which is a $+1$ eigenstate of all the $N_{r,j}$ simultaneously,
and where the $N_{r,j}$ are simpler to work with fault-tolerantly
than $M_r$, for example because they each act
on fewer qubits. To prepare $|\phi\rangle_L$, first prepare a $+1$
eigenstate of all the $N_{r,j}$, $(j = 1 \cdots p)$ (e.g. by measuring
them if they commute), and then measure all the other
$M_{i \neq r}$. Typically the $N_{r,j}$ will not commute with all the
$M_{i \neq r}$, but as long as the measurements are done in the
order described the final state is the same as if $M_r$ had
been measured.

For example, suppose we require the input state

$$|\phi\rangle_L = |00\rangle_L - |11\rangle_L + |01\rangle_L + |10\rangle_L. \qquad (4)$$

This has stabilizer $X_{10} Z_{01} = XZ$, $X_{01} Z_{10} = ZX$. Neither
of these observables can be measured easily, but the
product $(XZ)(ZX) = YY$ can, since it is not of mixed
type. We therefore adopt the set $\{M_i\} = \{XZ, YY\}$.
Decomposing $M_1 = XI \otimes IZ$, we see it is sufficient to
prepare a $+1$ eigenstate of $X$ in the first qubit, and of $Z$
in the second qubit, which is easy: the starting state is
$|00\rangle_L + |10\rangle_L$. Upon measuring $YY$ (and applying $IZ$ if
the measured eigenvalue is $-1$), $|\phi\rangle_L$ is obtained.

It was pointed out in [19] that the starting state which
will produce $|\phi\rangle_L$ when a single stabilizer observable $M_i$
is measured is the state $(I + Q_i)\,|\phi\rangle_L$. This observation
can also help in identifying suitable starting states.
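The two-qubit example can be verified numerically by treating the $YY$ measurement as a projection onto its $\pm 1$ eigenspaces. The sketch below works with unnormalised state vectors on two bare qubits standing in for the logical qubits; the helper names `ket`, `project` and `same_ray` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def ket(bits):
    """Computational basis state labelled by a bit string such as '10'."""
    v = np.zeros(2 ** len(bits), dtype=complex)
    v[int(bits, 2)] = 1
    return v


def project(state, observable, eigenvalue):
    """Project onto the +1 or -1 eigenspace of an observable with eigenvalues +-1."""
    return (state + eigenvalue * (observable @ state)) / 2


def same_ray(a, b):
    """True if a and b are equal up to a non-zero scalar factor."""
    return np.isclose(abs(np.vdot(a, b)), np.linalg.norm(a) * np.linalg.norm(b))


YY, XZ, ZX, IZ = np.kron(Y, Y), np.kron(X, Z), np.kron(Z, X), np.kron(I2, Z)

start = ket("00") + ket("10")                             # +1 eigenstate of XI and of IZ
target = ket("00") - ket("11") + ket("01") + ket("10")    # the state (4)

plus_branch = project(start, YY, +1)         # outcome +1: no correction needed
minus_branch = IZ @ project(start, YY, -1)   # outcome -1: apply IZ afterwards
```

Both branches end in the state (4), and applying $(I + Q)$ with $Q = IZ$ to the state (4) returns the starting state up to normalisation.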

We can go further and split up further $M_i$ operators
into their components $N_{i,j}$ as long as a $+1$ eigenstate
of all the $N$ operators at once can be prepared. For
example, the state required for the Toffoli gate discussed
in section VII has a set of 8 stabilizer generators including
$X_1 X_5\,{}^{C}X_{67}$, $X_2 X_6\,{}^{C}X_{57}$, $Z_1 Z_5$ and $Z_2 Z_6$. We split the
first two of these into $X_1 X_5$ and ${}^{C}X_{67}$, $X_2 X_6$ and ${}^{C}X_{57}$
respectively. Preparing the 7th bit in $|+\rangle_L$ is sufficient to
ensure a $+1$ eigenstate of both the controlled-gates. At
the same time we prepare the 1st and 5th bits in the Bell
state $|00\rangle_L + |11\rangle_L$ to ensure they are in a $+1$ eigenstate
of $X_1 X_5$ and $Z_1 Z_5$, and similarly for the 2nd and 6th
bits; see (17).



V. A FAULT-TOLERANT TOOLBOX 

We will now summarize some basic fault-tolerant oper- 
ations and methods that will be used in the constructions 
to be described. 

We restrict attention to CSS codes based on a doubly- 
even classical code that is contained by its dual. For such 
codes the following fault tolerant operations are easily 
available (see appendix): 

1. Operators in the Pauli group, acting on any logical
qubit or group of qubits in a block. 

2. Blockwise $H$ and ${}^{C}X$ and hence ${}^{C}Z$.



3. S acting blockwise but such that different logical 
qubits may be acted on by different powers of S, 
depending on the code (see lemma 4 in appendix). 



A. Transfer operation 

Gottesman [4] introduced the operation by which a
state is transferred from one qubit to another by a single
${}^{C}X$ gate and a measurement, and its use in stabilizer
codes to move a single qubit between blocks:

[Diagram: two versions of the transfer operation]

The diagram shows two versions of the operation (referred to as
examples of 'one-bit teleportation' in [19]). Since ${}^{C}X$ acts
as an identity operator when either the control bit is in
$|0\rangle$ or the target in $|+\rangle$, we can ensure the blockwise ${}^{C}X$
does not disturb other qubits in either the source block
or the destination block, by preparing states accordingly.
The next set of diagrams introduces a shorthand notation
for transfer operations of the first type, illustrating
various possibilities for the state preparations. In the
first case a qubit is transferred out of a full block without
disturbing the other bits in that block; in the last
case a qubit is transferred into a full block without disturbing
the other bits there; the middle example is an
intermediate case:

[Diagrams: shorthand notation for transfer operations of the first type]

The broken line followed by a zero is shorthand for measurement
in the $\{|0\rangle, |1\rangle\}$ basis followed by $X$ if the $-1$
eigenvalue was obtained, thus leaving the qubit in state
$|0\rangle$. The relevant point is that this state preparation does
not need a further recovery, so it takes place in the same
time-step as the rest of the transfer operation.

An illustrative set of possible transfer operations of the 
second type in JSJ is: 
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The vertical bar after the line break is shorthand for 
preparation of |+), that takes place via the measurement 
in JHJl. 
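The two variants of the transfer operation can be checked on bare (unencoded) qubits with a small state-vector sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the ordering of the two variants is not meant to match the "first" and "second" types of the diagrams, and no encoding or recovery is modelled.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def cx(control, target):
    """Two-qubit CX; qubit 0 is the left tensor factor."""
    if control == 0:
        return np.kron(P0, I2) + np.kron(P1, X)
    return np.kron(I2, P0) + np.kron(X, P1)

def transfer_measure_x(psi, outcome):
    """Source (qubit 0) controls a CX onto a destination in |0>;
    the source is then measured in the X basis."""
    state = cx(0, 1) @ np.kron(psi, [1, 0])
    state = np.kron(H, I2) @ state          # rotate X basis onto Z basis
    dest = state.reshape(2, 2)[outcome]     # project source onto |outcome>
    dest = dest / np.linalg.norm(dest)
    return Z @ dest if outcome == 1 else dest

def transfer_measure_z(psi, outcome):
    """Destination (qubit 1) in |+> controls a CX onto the source;
    the source is then measured in the Z basis."""
    plus = np.array([1, 1]) / np.sqrt(2)
    state = cx(1, 0) @ np.kron(psi, plus)
    dest = state.reshape(2, 2)[outcome]
    dest = dest / np.linalg.norm(dest)
    return X @ dest if outcome == 1 else dest
```

For either measurement outcome, the returned destination state equals the input `psi` once the conditional Pauli correction has been applied.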



B. Teleportation 



We define a shorthand notation for teleportation. It is used to move
a qubit from one block to a different location in another block. The
initial Bell state preparation is done by a single recovery, so the
complete network requires 3 time steps; in the diagrams these are
shown separated by dashed vertical lines.

The qubit is moved from the i'th position in the source block to the
j'th position in the destination block. The network construction is
straightforward when both the i'th and j'th qubits of the destination
block are available to be prepared in the Bell state. The next
network, (10), shows how to accomplish teleportation from a full block
to another which has only one unused position. This requires two
transfers to put the Bell state in the right place, and a naive
construction would require 4 time steps. However, the second transfer
can take place simultaneously with the teleportation step.

This network has the interesting feature that the transfer and
teleportation in the final step commute, and can therefore be applied
simultaneously. One way to 'read' the network is to argue that the
upper of the two simultaneous blockwise $^C\!X$ gates creates a GHZ
state $|000\rangle + |111\rangle$ between the middle bits of the 1st
two blocks and the upper bit of the 3rd; this entangled triplet
replaces the entangled pair in the standard teleportation.
X-measurements on two of these qubits are then needed to disentangle
them from the one which is teleported.
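The 'GHZ reading' of the network can be illustrated on four bare qubits. The sketch below is an assumption-laden toy, not the encoded network: the gate ordering, the choice of which GHZ qubit controls the $^C\!X$, and the name `ghz_teleport` are all chosen here for illustration. The source is measured in the Z basis and two of the three GHZ qubits in the X basis, after which the remaining GHZ qubit carries the state up to Pauli corrections.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def apply_1q(state, gate, q):
    """Apply a single-qubit gate to axis q of a state tensor."""
    return np.moveaxis(np.tensordot(gate, state, axes=([1], [q])), 0, q)

def apply_cx(state, control, target):
    """CX written as P0 (x) I + P1 (x) X on the chosen axes."""
    flipped = apply_1q(apply_1q(state, P1, control), X, target)
    return apply_1q(state, P0, control) + flipped

def ghz_teleport(psi, s, t, m):
    """Teleport psi through a GHZ triplet (A, B, C) for measurement
    outcomes s (source, Z basis), t (A, X basis), m (B, X basis)."""
    ghz = np.zeros((2, 2, 2), dtype=complex)
    ghz[0, 0, 0] = ghz[1, 1, 1] = 1 / np.sqrt(2)
    state = np.tensordot(psi, ghz, axes=0)   # axes: source, A, B, C
    state = apply_cx(state, 1, 0)            # A controls, source is target
    state = apply_1q(state, H, 1)            # X-basis measurement of A
    state = apply_1q(state, H, 2)            # X-basis measurement of B
    out = state[s, t, m, :]
    out = out / np.linalg.norm(out)
    if (t + m) % 2:
        out = Z @ out                        # Z correction from X outcomes
    if s:
        out = X @ out                        # X correction from Z outcome
    return out
```

All eight outcome combinations return the original state, which is the sense in which the triplet can stand in for the usual entangled pair.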



VI. CONTROLLED-NOT 

We now turn to implementing $^C\!X$ between any single pair of qubits.
We treat the case where the qubits are in the same block, which will
illustrate all the essential ideas.

One method is to use two teleportations and a blockwise $^C\!X$. A
naive construction would require 3 + 1 + 3 = 7 time-steps, but by
choosing transfer operations that leave states ready-prepared for the
subsequent step, and combining steps where possible, this is reduced
to 5, as in network (11).

The shaded area is the offline part, where, as discussed in section
III A, we count the initial teleportation (from 'memory' to
'accumulator') as online, and the final teleportation (from
'accumulator' to 'memory') as offline.

The Bell-state measurement that forms part of the standard
teleportation operation (section V B) begins with a $^C\!X$ gate
involving one of the qubits of the entangled pair. However, when using
whole-block operations it is easier to implement a group of $^C\!X$
gates such that both qubits of the entangled pair are operated on
(either as target or control bits). We therefore consider the network
(12), which teleports the second logical qubit (initially in state
$|y\rangle_L$), where the initial blockwise $^C\!X$ is implemented
without insisting that the first qubit is prepared in $|0\rangle_L$
(it is in some general state $|x\rangle_L$ instead).

This shows that the result is a $^C\!X$ operation between the first
and second bits, with the second output bit teleported into the second
block. Equation (12) may be derived by starting with the right hand
side (which shows a teleport followed by $^C\!X$) and commuting the
final $^C\!X$ backwards. To complete the $^C\!X$ operation, the target
qubit can be teleported back to its original block at the end as in
(10). Using similar ideas to those in (11), the complete network,
including gathering the qubits into one block at the end, requires 4
blocks and 3 time steps, of which 1 is online.

The concept behind equation (12) can be extended so as to achieve
networks of $C_2$ gates involving up to half the qubits in a block in
a single online step, as long as the network finishes with a set of
$^C\!X$ gates connecting the non-teleported bits to the teleported
ones; network (13) is an example. The initial preparation step is an
example of the corollary to theorem 1. This network is an example of a
class of networks discussed below in connection with theorem 2.

When x = z = 0, (13) is an example of the general method introduced by
Gottesman and Chuang in [7].

If we introduce a further ancillary block, the Gottesman-Chuang method
can achieve $^C\!X$ between bits in the same block while teleporting
the whole block, thus keeping its constituent logical bits together:




(14) 



The offline state preparation shown in the dashed box can be
accomplished in three time steps, by making use of the following
equivalences:

(15)

The zeros just after the transfer operation represent state
preparations that take place at the same time as the transfer. They
ensure the final blockwise $^C\!X$ in (15) has the correct entangling
effect.

Network            (11)              (12)              (14)
              offline  online   offline  online   offline  online
blocks           4       4         4       2         2       3
time steps       3       2         2       1         3       1
area            13       5         9       2         6       3

TABLE I: Summary of resources required by three networks
for $^C\!X$ between bits in the same block.

The resources required by the $^C\!X$ constructions of equations (11),
(12), (14) are summarized in table 1.



A. Discussion 

For a code with k = 1 the gate we have discussed would be trivial: a
single transversal $^C\!X$ suffices, followed by a single recovery. It
is noteworthy that the more complicated (but more space-efficient)
codes with k > 1 can achieve the gate without any slow-down: the
online parts of (12) and (14) require only a single time step. Similar
constructions can be found for other operators in the group $C_2$,
using the general insight of commuting gates backwards through
teleportations. The main contributions of the present study are the
extended use of recovery operations for preparing entangled states
(avoiding the need for cat states), the minimization of time steps by
careful construction in (10), (11), (15), and the possibility of
multi-qubit networks of $C_2$ gates in a single online step, as
illustrated by (13). We now generalize the latter point.

 Theorem 2. Any network of gates in $C_2$ (the Clifford group)
 can be applied fault-tolerantly to any group of logical bits
 (in the same or different blocks) using a single online
 time step.

Proof: The result is obtained from applying the Gottesman-Chuang
method illustrated in (14) not just to single gates such as $^C\!X$ or
H, but to networks of gates. Suppose the bits involved in the network
occupy N blocks. They are all teleported using N pairs of blocks. As
long as all the gates in the network to be implemented are in $C_2$,
they can all be commuted backwards through the Pauli operations
involved in the teleportations such that still only Pauli operations
are required to complete the teleportation. The final Pauli operations
can then be applied all at once immediately after the measurements.

Diagram (13) illustrates a related result: some networks of $C_2$
gates can be implemented among bits in a single block using only a
single extra block.
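The step in the proof that relies on commuting $C_2$ gates through Pauli operations can be illustrated numerically on two bare qubits. The following is a minimal sketch with hypothetical helper names (`as_pauli`, `conjugate`); it simply checks whether $U P U^\dagger$ is again a Pauli operator up to a phase.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])   # not a Clifford gate
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

SINGLE = [("I", I2), ("X", X), ("Y", Y), ("Z", Z)]
PAULIS_2Q = {a + b: np.kron(p, q)
             for (a, p), (b, q) in itertools.product(SINGLE, repeat=2)}

def as_pauli(op):
    """Return (label, phase) if op == phase * Pauli, else None."""
    for label, pauli in PAULIS_2Q.items():
        for phase in (1, -1, 1j, -1j):
            if np.allclose(op, phase * pauli):
                return label, phase
    return None

def conjugate(gate, label):
    """Identify gate * P * gate^dagger for the two-qubit Pauli P."""
    return as_pauli(gate @ PAULIS_2Q[label] @ gate.conj().T)

# A small Clifford network: H on qubit 0 and S on qubit 1, then CX.
NETWORK = CX @ np.kron(H, S)
```

Every two-qubit Pauli conjugated by `NETWORK` is again a Pauli, so the teleportation corrections remain Pauli operations; conjugating by `T` instead gives `None`, which is why gates outside $C_2$ need separate treatment.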



VII. TOFFOLI GATE 

Following we will use the following type of con- 
struction for the Toffoli gate: 




(16) 



This approach, rather than Shor's original network (related to
one-bit teleportation, see [19]), is adopted because it lends itself
better to blockwise operations. In (16) a fourth qubit of each block
is included in order to show what happens to the rest of the bits that
are not involved in the gate itself.

In order to keep the network as rapid as possible, all the
measurements should take place together, and then whichever of the
further operations are needed (conditional on the measurement results)
should be applied as soon as possible. This flexibility in timing of
the final operations is not shown in the diagram.

The dashed box is an offline preparation which we will discuss below.
Of the 8 measurements in (16), 5 involve single-bit operators that can
be applied (when needed) in the same time step as the blockwise
$^C\!X$ and the measurements themselves. The other three involve 2-bit
gates. Using the methods of either (12) or (14) each such gate needs
only a single online time step, as long as sufficient spare blocks are
available for offline preparations and/or teleportations. However,
they cannot all take place simultaneously if we retain the condition
that only one two-block gate involving any given block is allowed per
recovery, to prevent avalanches of errors. Of the 8 equiprobable
measurement outcomes of this group of 3 measurements, one requires no
action, three require a single time-step, three require 2 time-steps
and one
requires 3. The average number of online time steps required by the
complete network is therefore 13/8 ≈ 1.6.

 i    M_i                        Q_i
 1    ...                        ...
 2    ...                        ...
 3    ...                        ...
 4    ...                        ...
 5    Z_1 Z_5                    X_1
 6    Z_2 Z_6                    X_2
 7    Z_3 Z_7 ^C Z_56            X_3
 8    Z_4 Z_8                    X_4

TABLE II: Stabilizer operators $M_i$ for the input state in the
Toffoli gate network, with their associated anticommuting operators
$Q_i$.
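The average of 13/8 online time steps quoted for the Toffoli network can be reproduced by enumerating the 8 outcomes of the three measurements that call for 2-bit gates. The sketch below rests on one assumption that is chosen here because it reproduces the quoted figure: an outcome needing no 2-bit gate still costs the one online step used by the blockwise gate and measurements, and otherwise each required gate occupies one step. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def average_online_steps():
    """Average online time steps over the 8 equiprobable outcomes."""
    total = Fraction(0)
    for outcome in product((0, 1), repeat=3):
        gates_needed = sum(outcome)      # 2-bit gates, done one per step
        steps = max(1, gates_needed)     # assumed: at least one step always
        total += Fraction(steps, 8)
    return total

print(average_online_steps())
```

The step counts per outcome are 1, 1, 1, 1, 2, 2, 2, 3 in some order, which sum to 13.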



Let $|\phi\rangle_L$ be the state we need to prepare, as defined by
the dashed box in (16). The stabilizer of $|\phi\rangle_L$ is
generated by the operators listed in table 2. Five of these operators
are in the Pauli group $C_1$, three are not in $C_1$ but are in the
Clifford group $C_2$. Fault-tolerant measurement of the 5 Pauli group
operators can be done through a recovery as in section IV A.
Fault-tolerant measurement of the 3 Clifford group operators can be
done by Shor's cat state method [1]. Shor described the method as
applied to certain [[n, 1, d]] CSS codes; we generalize it in the
appendix to [[n, k, d]] codes of the type under discussion (lemma 5).

We would like to minimise the need to prepare cat states. Recalling
the discussion in section IV B, we can factorize the stabilizer
operators in any convenient way
and prepare a +1 eigenstate of the component operators 
N r _j. By this means it is possible to avoid the need to 
measure any two out of $M_1$, $M_2$ and $M_7$. For example, the
discussion at the end of section IV B showed how to avoid the need to
measure $M_1$ and $M_2$. The complete state preparation indicated by
the dashed box in (16) is then obtained with

(17)

where the diagram on the right explains the logical effect of the
fault-tolerant diagram on the left. Bit number 6 has been left in a
separate block so that if the $^C\!Z_{56}$ gate in (16) is needed then
it can be implemented immediately. To minimise the number of online
time steps bit 7 should also be positioned in a separate block. This
can be done using the same Bell-state preparation followed by transfer
as is indicated in (17) for bits 2 and 6. The diagram shows an
alternative approach that uses fewer blocks. Bit 6 (and 7 if
necessary) can be repositioned back into the same block as 5 and 8 by
teleportations after the end of (16).



A. Discussion 

The network for the Toffoli gate between bits within a 
block involves at least 5 blocks (one of which is used for 
the cat state) and 1 cat-state-based measurement. The 
average number of online time steps is 13/8 if a further 
block is used, and slightly more than this otherwise. The 
main result is that the number of online time steps is in- 
dependent of k, and in particular is the same for [[n, k, d]] 
codes with k > 1 as for k = 1. Similar methods apply to 
other gates in the class C3. 



VIII. CONCLUSION 

We have considered fault-tolerant networks for logic 
operations on bits encoded in CSS codes, concentrating 
on codes based on a doubly-even classical code that is 
contained by its dual (some of the methods are more gen- 
eral). We have shown how to extend the use of the recov- 
ery operation to allow preparation of an interesting class 
of logical states (theorem 1 and its corollary). The imple- 
mentation of certain networks in a single online time step 
(theorem 2) is implicit in the Gottesman-Chuang work; 
we have shown that the offline state preparation for such 
networks can be accomplished efficiently using theorem 
1. 

We have presented optimized constructions of fault- 
tolerant networks for all the members of a universal set 
of operations. The optimization is primarily to min- 
imise on-line time steps, where one 'time step' is de- 
fined to include a single recovery of the whole computer. 
The constructions show that fault-tolerant operations for 
[[n, k > 1, d]] codes require the same number of time steps 
as those for [[n, 1, d]] codes. It follows that the total num- 
ber of recoveries needed to implement a complete algo- 
rithm is the same when k > 1 as when k = 1. The num- 
ber of individual block recoveries is smaller when k > 1 
because then there are fewer blocks, assuming the com- 
puter has more memory blocks than workspace. 

We would like to thank D. Lewis and S. O'Keefe for 
contributions to the development of the network designs. 
This work was supported by the EPSRC, the Research 
Training and Development and Human Potential Pro- 
grams of the European Union, the National Security 
Agency (NSA) and Advanced Research and Development 
Activity (ARDA) (P-43513-PH-QCO-02107-1). 



IX. APPENDIX: BASIC OPERATIONS FOR 
CSS CODES 

We describe the fault-tolerant implementation of the 
basic gates assumed in the main text. Some of the results, 
such as lemmas 2 and 3 were obtained by Gottesman 
using stabilizer methods. We derive them by a different 
method and add further information. 

Consider the effect of some operation (produced by a network of
quantum gates or measurements) on the physical qubits of one or more
encoded blocks. We define an operation to be 'legitimate' if it maps
the encoded Hilbert space onto itself. Transversal application of a
two-bit operator is defined to mean the operator is applied once to
each pair of corresponding physical bits in two blocks, and similarly
for transversal three-bit operations across three blocks. Legitimate
transversal operations are fault tolerant.

Typically a legitimate transversal operation will result in a
blockwise operation (defined in the main text; c.f. lemma 2), but this
need not always be the case.

The tilde, as in $\tilde{U}$, is used to denote the operation U
applied to the physical qubits. Operators without a tilde are
understood to act on the logical, i.e. encoded, qubits. Thus
$_L\langle u|U|v\rangle_L = \langle u|U|v\rangle$.

The CSS quantum codes are those whose stabilizer generators separate
into X and Z parts [20, 21, 22, 23, 24]. We restrict attention to
these codes, rather than any stabilizer code, because they permit a
larger set of easy-to-implement fault tolerant operations, and their
coding rate k/n can be close to that of the best stabilizer codes. The
CSS codes have the property that the zeroth quantum codeword can be
written as an equal superposition of the words of a linear classical
code $C_0$,

$$|0\rangle_L = \sum_{x \in C_0} |x\rangle, \qquad (18)$$

where $|x\rangle$ is a product state, x is a binary word (1 x n row
vector), and the other codewords are formed from cosets of $C_0$. Let
D be the k x n binary matrix of coset leaders, then the complete set
of encoded basis states is given by

$$|u\rangle_L = \sum_{x \in C_0} |x + uD\rangle, \qquad (19)$$

where u is a k-bit binary word (1 x k row vector).

Consider a CSS code as defined in eq. (19). Then one possible choice
for the encoded X and Z operators is

$$X_u = \tilde{X}_{uD}, \qquad (20)$$

$$Z_u = \tilde{Z}_{u (D D^T)^{-1} D}. \qquad (21)$$

Equation (20) follows immediately from the code construction (19).
Eq. (21) may be obtained as follows. Since we are dealing with row
vectors, the scalar product is $x \cdot y = x y^T$. Now, consider
$y \in C_0^\perp$: then $y \cdot x = 0$ for every $x \in C_0$, so

$$\tilde{Z}_y |x + uD\rangle = (-1)^{y \cdot uD} |x + uD\rangle \qquad (22)$$

and hence

$$\tilde{Z}_y |u\rangle_L = (-1)^{y \cdot uD} |u\rangle_L, \qquad (23)$$

but $Z_v |u\rangle_L = (-1)^{v \cdot u} |u\rangle_L$, so we need to
solve $v \cdot u = y \cdot (uD)$ for y:

$$y D^T u^T = v u^T \qquad (24)$$

$$y D^T = v \qquad (25)$$

$$y = v (D D^T)^{-1} D \qquad (26)$$

where we assume the inverse of the square (k x k) matrix $D D^T$
exists. We will mostly be concerned with cases where $D D^T$ is an
identity matrix. To check for consistency, we should confirm that
$y \in C_0^\perp$ as was assumed; the proof of this is omitted here,
but it is obvious for the case of a weakly self-dual code with
$D D^T = I$.
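The condition $y D^T = v$ can be checked on a small example. The sketch below uses a 4-bit code with $C_0 = \{0000, 1111\}$ and two coset leaders, chosen here purely for illustration (it is not a code discussed in the text); for it $D D^T$ is invertible over GF(2) but is not the identity. The helper names are hypothetical.

```python
import itertools
import numpy as np

def gf2_inverse(a):
    """Brute-force inverse of a small square binary matrix over GF(2)."""
    k = a.shape[0]
    for bits in itertools.product((0, 1), repeat=k * k):
        m = np.array(bits).reshape(k, k)
        if np.array_equal((a @ m) % 2, np.eye(k, dtype=int)):
            return m
    raise ValueError("matrix is singular over GF(2)")

# Illustrative 4-bit example: C0 is the repetition code, D holds two
# even-weight coset leaders, so k = 2.
C0 = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])
D = np.array([[1, 1, 0, 0], [1, 0, 1, 0]])

def physical_z_word(v):
    """Word y with y D^T = v, taken in the row space of D."""
    return (np.array(v) @ gf2_inverse((D @ D.T) % 2) @ D) % 2

def z_phase_matches(v):
    """Check y.(x + uD) = v.u (mod 2) for every u and every x in C0."""
    y = physical_z_word(v)
    for u in itertools.product((0, 1), repeat=2):
        for x in C0:
            word = (x + np.array(u) @ D) % 2
            if (y @ word) % 2 != (np.array(v) @ np.array(u)) % 2:
                return False
    return True
```

Here `physical_z_word((1, 0))` is the word 1010, i.e. the second row of D rather than the first, which is the effect of the off-diagonal $D D^T$.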

Note that, when operating on codewords, $\tilde{X}_{x+y}$ is
equivalent to $\tilde{X}_x$ for all $y \in C_0$, so each X operator is
a member of a group of $2^\kappa$ equivalent operators, where
$\kappa = (n - k)/2$ is the dimension of $C_0$. Another way of seeing
this is to note that since $C_0 \subset C_0^\perp$, $\tilde{X}_y$ with
$y \in C_0$ is in the quantum code stabilizer. Similar statements
apply to the Z operators. The complete set of $2^{2n}$ Pauli X or Z
operators on n bits is thus divided up as

$2^{(n-k)/2}$    X-stabilizer members
$2^{(n-k)/2}$    Z-stabilizer members
$2^k$            X operators
$2^k$            Z operators
$2^{(n-k)/2}$    detectable X errors
$2^{(n-k)/2}$    detectable Z errors


Lemma 1. For [[n, 1, d]] codes where all words in $|0\rangle_L$ have
weight $r_0$ mod w, and all words in $|1\rangle_L$ have weight $r_1$
mod w, transversal application of the following are legitimate:
$P(2\pi/w)$, $^C\!P(4\pi/w)$, $^{CC}\!P(8\pi/w)$, and achieve
respectively $P(2\pi r/w)$, $^C\!P(4\pi r/w)$, $^{CC}\!P(8\pi r/w)$,
where $r = r_1 - r_0$.

Lemma 1 applied to codes with w = 8 or more provides a quicker way to
generate the Toffoli gate T and its partners $^C\!S$ and $P(\pi/4)$
than has been previously discovered. The concept generalizes to
$^{CCC}\!P(16\pi/w)$ and so on, but the codes for which this is useful
(i.e. having $w \geq 16$) are either inefficient or too unwieldy to
produce good error thresholds.

Proof: for clarity we will take $r_0 = 0$ and $r_1 = r$; the proof is
easily extended to general $r_0$. The argument for $^C\!P(4\pi/w)$ was
given in [12], but we shall need it for $^{CC}\!P(8\pi/w)$, so we
repeat it here. Consider $^C\!P(4\pi/w)$ applied to a tensor product
of two codewords. Let x, y be binary words appearing in the
expressions for the two codewords, and let a be the overlap (number of
positions sharing a 1) between x and y. Let |x| denote the weight of a
word x. Then $2a = |x| + |y| - |x + y|$. There are three cases to
consider. First, if $x, y \in C_0$ then |x| = 0 mod w, |y| = 0 mod w
and |x + y| = 0 mod w, so 2a = 0 mod w, from which a = 0 mod w/2.
Therefore the multiplying factor introduced by the transversal
operation is 1. If $x \in C_0$ and $y \in C_1$ then $x + y \in C_1$,
so |x| = 0 mod w, |y| = |x + y| = r mod w, so 2a = 0 mod w again. If
$x, y \in C_1$ then $x + y \in C_0$, so a = r mod w/2 and the
multiplying factor is $\exp(i r 4\pi/w)$. The resulting operation in
the logical Hilbert space is therefore $^C\!P(4 r \pi/w)$.

Next consider $^{CC}\!P(8\pi/w)$ applied to a tensor product of three
codewords. Let x, y, z be words appearing in the three codeword
expressions, and a, b, c be the overlap between x and y, y and z, and
z and x, respectively. Let d be the common overlap of x, y and z, so

$$|x + y + z| = |x| + |y| + |z| - 2a - 2b - 2c + 4d. \qquad (27)$$

There are four cases to consider. If $x, y, z \in C_0$ then d = 0 mod
w/4. If $x, y \in C_0$, $z \in C_1$ then |x + y + z| = |z| mod w and
2a = 2b = 2c = 0 mod w from the argument just given, therefore d = 0
mod w/4. If $x \in C_0$, $y, z \in C_1$ then $x + y + z \in C_0$,
2a = 2c = 0 mod w while 2b = 2r mod w = |y| + |z|, so again d = 0 mod
w/4. If $x, y, z \in C_1$ then $x + y + z \in C_1$, 2a = 2b = 2c = 2r
mod w, therefore d = r mod w/4. The overall effect is that of the
operation $^{CC}\!P(8\pi r/w)$. QED
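The overlap conditions used in this proof can be checked by brute force on a concrete code with w = 8. The example below is an assumption made for illustration and is not taken from the text: $C_0$ is spanned by four weight-8 words of length 15 (row i holds bit i of the position index 1..15), and $C_1$ is its coset by the all-ones word, so that r = 7. The helper names are hypothetical.

```python
import itertools
import numpy as np

N, W, R = 15, 8, 7
ROWS = np.array([[(j >> i) & 1 for j in range(1, N + 1)] for i in range(4)])

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    return [np.array(c) @ rows % 2
            for c in itertools.product((0, 1), repeat=len(rows))]

C0 = span(ROWS)                      # words of |0>_L, weights 0 mod 8
C1 = [(x + 1) % 2 for x in C0]       # words of |1>_L, weights 7 mod 8

def overlap(*words):
    """Number of positions where all the given words have a 1."""
    return int(np.prod(words, axis=0).sum())

def expected(*in_c1):
    """Overlap predicted by the proof: r if every word is in C1, else 0."""
    return R if all(in_c1) else 0

def check_lemma1():
    labelled = [(x, 0) for x in C0] + [(x, 1) for x in C1]
    for (x, kx), (y, ky) in itertools.product(labelled, repeat=2):
        if (overlap(x, y) - expected(kx, ky)) % (W // 2):
            return False             # pair overlap a must match mod w/2
    for (x, kx), (y, ky), (z, kz) in itertools.product(labelled, repeat=3):
        if (overlap(x, y, z) - expected(kx, ky, kz)) % (W // 4):
            return False             # triple overlap d must match mod w/4
    return True
```

With r = 7 the lemma then predicts that transversal $P(\pi/4)$ acts as $P(7\pi/4)$, the inverse of $P(\pi/4)$, on the logical qubit of this example.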

Lemma 2. Transversal $^C\!X$ is legitimate for all CSS codes, and
acts as blockwise $^C\!X$.

Proof: transversal $^C\!X$ acts as follows:

$$^C\!X_{\rm tr} |u\rangle_L |v\rangle_L
  = \sum_{x \in C_0} \sum_{y \in C_0} |x + uD\rangle |y + vD + x + uD\rangle
  = |u\rangle_L |u + v\rangle_L. \qquad (28)$$

This is $^C\!X$ from each logical qubit in the first block to the
corresponding one in the second. QED

Lemma 3. Transversal H and $^C\!Z$ are legitimate for any
$[[n, 2k_c - n, d]]$ CSS code obtained from a $[n, k_c, d]$ classical
code that contains its dual, giving the effects

$$H_{\rm tr} |u\rangle_L = \sum_{v=0}^{2^k - 1} (-1)^{u D D^T v^T} |v\rangle_L, \qquad (29)$$

$$^C\!Z_{\rm tr} |u\rangle_L |v\rangle_L = (-1)^{u D D^T v^T} |u\rangle_L |v\rangle_L. \qquad (30)$$

Equation (29) is a blockwise H when $D D^T = I$, and is a closely
related transformation when $D D^T \neq I$. Equation (30) is a
blockwise $^C\!Z$ when $D D^T = I$, and a related transformation
otherwise.

Proof: transversal H acts as follows on $|u\rangle_L$:

$$H_{\rm tr} \sum_{x \in C_0} |x + uD\rangle = \sum_{y \in C_0^\perp} (-1)^{u D y^T} |y\rangle. \qquad (31)$$

If $C_0^\perp$ contains its dual $C_0$, as required for lemma 3, then
D and $C_0$ together generate $C_0^\perp$, so this can be written

$$H_{\rm tr} |u\rangle_L
  = \sum_{v=0}^{2^k - 1} \sum_{x \in C_0} (-1)^{u D D^T v^T} |x + vD\rangle
  = \sum_{v=0}^{2^k - 1} (-1)^{u D D^T v^T} |v\rangle_L \qquad (32)$$

where to simplify the power of $(-1)$ in the first equation we used
the fact that $C_0$ is generated by the parity check matrix of
$C_0^\perp$, so uD satisfies the parity check $x \in C_0$.

Equation (30) is proved straightforwardly by expanding $|u\rangle_L$
and $|v\rangle_L$ as in (19), and then using
$(x + uD)(y + vD)^T = u D D^T v^T$ mod 2 for all the terms in the sum
when $C_0 \subset C_0^\perp$.
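Equation (29) can be checked numerically on a small example in which $D D^T \neq I$. The sketch below again uses the illustrative 4-bit code with $C_0 = \{0000, 1111\}$ and two even-weight coset leaders (an assumption made here, not a code from the text), with normalized state vectors; the function names are hypothetical.

```python
import itertools
from functools import reduce
import numpy as np

N, K = 4, 2
C0 = [np.array(w) for w in ([0, 0, 0, 0], [1, 1, 1, 1])]
D = np.array([[1, 1, 0, 0], [1, 0, 1, 0]])

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
H_TR = reduce(np.kron, [H1] * N)          # transversal Hadamard

def basis_index(word):
    """Index of a computational basis state given as a bit array."""
    return int("".join(str(int(b)) for b in word), 2)

def logical_state(u):
    """Normalized |u>_L as in (19): equal superposition over x + uD."""
    vec = np.zeros(2 ** N)
    for x in C0:
        vec[basis_index((x + np.array(u) @ D) % 2)] += 1
    return vec / np.linalg.norm(vec)

def predicted_hadamard(u):
    """Right-hand side of (29), normalized."""
    ddt = (D @ D.T) % 2
    out = np.zeros(2 ** N)
    for v in itertools.product((0, 1), repeat=K):
        sign = (-1) ** int(np.array(u) @ ddt @ np.array(v) % 2)
        out += sign * logical_state(v)
    return out / np.linalg.norm(out)
```

For this example $D D^T$ is the off-diagonal 2 x 2 matrix, so the result is a Hadamard transform with the roles of the two logical qubits exchanged in the exponent, one instance of the 'closely related transformation' mentioned above.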

Lemma 4. Let C be a $[n, k_c, d]$ classical code that contains its
dual, and for which the weights of the rows of the parity check matrix
are all integer multiples of 4. Then transversal S is legitimate for
the $[[n, 2k_c - n, d]]$ CSS code obtained from C, and has the effect

$$S_{\rm tr} |u\rangle_L = i^{|uD|} |u\rangle_L. \qquad (33)$$

The case $D D^T = I$, which leads to a simple effect for transversal
H, also simplifies transversal S. If $D D^T = I$ then every row of D
has odd overlap with itself (i.e. odd weight) and even overlap with
all the other rows. Using an argument along similar lines to that in
the proof of lemma 1, we deduce that the effect is the $S^r$ operator
applied to every logical qubit in the block, where r is the weight of
the relevant row of D.

Proof: We will prove lemma 4 by showing that all the quantum codewords
have |x + uD| = |uD| mod 4, so the weights modulo 4 of the components
in (19) depend on u but not on x. The effect of transversal S will
therefore be to multiply $|u\rangle_L$ by the phase factor $i^{|uD|}$.

The zeroth codeword is composed from the code $C_0 = C^\perp$
generated by $H_C$, the parity check matrix of C. Let y and z be two
rows of $H_C$, then the conditions of the lemma guarantee |y| = 0 mod
4 and |z| = 0 mod 4. Furthermore, since C contains its dual, each row
of $H_C$ satisfies all the checks in $H_C$, so y and z have even
overlap 2m. Therefore |y + z| = 4m mod 4 = 0 mod 4, therefore |x| = 0
mod 4 for all words in $|0\rangle_L$. Next consider a coset, formed by
displacing $C_0$ by the vector w = uD. Since this coset is in C it
also satisfies all the checks in $H_C$, therefore its members have
even overlap with any $x \in C_0$. Hence if |w| = r mod 4 then
|x + w| = r mod 4 for all the terms in the coset, which proves the
lemma. QED
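The weight argument of this proof can be checked directly on a familiar example. The sketch below assumes the 7-bit Hamming code as C (so the quantum code is the [[7, 1, 3]] code, chosen here for illustration) with the all-ones word as the single coset leader; the function name `transversal_s_phase` is hypothetical.

```python
import itertools
import numpy as np

# Parity check matrix of the 7-bit Hamming code: every row has weight 4
# and the rows are mutually orthogonal, so C contains its dual.
H_C = np.array([[0, 0, 0, 1, 1, 1, 1],
                [0, 1, 1, 0, 0, 1, 1],
                [1, 0, 1, 0, 1, 0, 1]])
D = np.array([[1, 1, 1, 1, 1, 1, 1]])     # coset leader, k = 1

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    return [np.array(c) @ rows % 2
            for c in itertools.product((0, 1), repeat=len(rows))]

def transversal_s_phase(u):
    """Phase i^{|uD|} picked up by |u>_L, after checking that every
    word x + uD in the superposition has the same weight mod 4."""
    shift = np.array(u) @ D % 2
    weights = {int(((x + shift) % 2).sum()) % 4 for x in span(H_C)}
    if len(weights) != 1:
        raise ValueError("weights differ mod 4")
    return 1j ** weights.pop()
```

For this example the phases are 1 for u = 0 and $i^7 = -i$ for u = 1, so transversal S acts as $S^7 = S^{-1}$ on the logical qubit, in line with the $S^r$ statement above.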

Lemma 5. For CSS codes in which transversal $^C\!Z$ is legitimate,
transversal $^{CC}\!Z$ is legitimate when operating on two control
blocks in the logical Hilbert space, and a target block in the space
spanned by $|0^{\otimes n}\rangle$, $|1^{\otimes n}\rangle$. If
transversal $^C\!Z$ has the effect
$|u\rangle_L |v\rangle_L \to (-1)^{u v^T} |u\rangle_L |v\rangle_L$,
then transversal $^{CC}\!Z$ has the effect
$|u\rangle_L |v\rangle_L |a^{\otimes n}\rangle \to (-1)^{a (u v^T)} |u\rangle_L |v\rangle_L |a^{\otimes n}\rangle$,
where a = 0 or 1.

Proof: Consider eq. (30) and expand $|u\rangle_L |v\rangle_L$ into a
sum of 2n-bit product states $|x\rangle |y\rangle$. The transversal
$^C\!Z$ operator can only have the effect (30) if the overlap of x and
y is the same, modulo 2, for every term in the sum. Therefore the
transversal $^{CC}\!Z$ operator as described in lemma 5 produces the
same number of Z operations on the cat state, modulo 2, for every term
in the corresponding expansion, and the effect is as described. QED



[1] P. W. Shor. Fault-tolerant quantum computation. In Proc. 37th
Annual Symposium on Foundations of Computer Science, pages 56-65,
Los Alamitos, 1996. IEEE Press. quant-ph/9605011.

[2] D. P. DiVincenzo and P. W. Shor. Fault-tolerant error correction
with efficient quantum codes. Phys. Rev. Lett., 77:3260-3263, 1996.

[3] A. M. Steane. Active stabilisation, quantum computation, and
quantum state synthesis. Phys. Rev. Lett., 78:2252-2255, 1997.
quant-ph/9608026.

[4] D. Gottesman. A theory of fault-tolerant quantum computation.
Phys. Rev. A, 57:127-137, 1998. quant-ph/9702029.

[5] A. M. Steane. Efficient fault-tolerant quantum computing. Nature,
399:124-126, 1999. quant-ph/9809054.

[6] M. A. Nielsen and I. L. Chuang. Programmable quantum gate arrays.
Phys. Rev. Lett., 79:321-324, 1997.

[7] D. Gottesman and I. L. Chuang. Quantum teleportation is a
universal computational primitive. Nature, 402:390, 1999.
quant-ph/9908010.

[8] E. Knill, R. Laflamme, and W. H. Zurek. Resilient quantum
computation: error models and thresholds. Science, 279:342-345, 1998.

[9] A. M. Steane. Overhead and noise threshold of fault-tolerant
quantum error correction. Phys. Rev. A, 68:042322, 2003.

[10] D. Gottesman. The Heisenberg representation of quantum computers.
1998. quant-ph/9807006.

[11] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum
Information. Cambridge University Press, Cambridge, 2000.

[12] E. Knill, R. Laflamme, and W. Zurek. Accuracy threshold for
quantum computation. 1996. quant-ph/9610011.

[13] Y. Shi. Both Toffoli and controlled-not need little help to do
universal quantum computation. 2002. quant-ph/0205115.

[14] Y. Shi. A simple proof that Toffoli and Hadamard are quantum
universal. 2003. quant-ph/0301040.

[15] D. Beckman, A. N. Chari, S. Devabhaktuni, and J. Preskill.
Efficient networks for quantum factoring. Phys. Rev. A, 54:1034-1063,
1996.

[16] V. Vedral, A. Barenco, and A. Ekert. Quantum networks for
elementary arithmetic operations. Phys. Rev. A, 54:147-153, 1996.

[17] A. M. Steane. A fast fault-tolerant filter for quantum codewords.
quant-ph/0202036.

[18] A. M. Steane. Multiple particle interference and quantum error
correction. Proc. Roy. Soc. Lond. A, 452:2551-2577, 1996.

[19] X. Zhou, D. W. Leung, and I. L. Chuang. Methodology for quantum
logic gate construction. Phys. Rev. A, 62:052316, 2000.

[20] D. Gottesman. Class of quantum error-correcting codes saturating
the quantum Hamming bound. Phys. Rev. A, 54:1862-1868, 1996.

[21] A. M. Steane. Error correcting codes in quantum theory. Phys.
Rev. Lett., 77:793-797, 1996.

[22] A. R. Calderbank and P. W. Shor. Good quantum error-correcting
codes exist. Phys. Rev. A, 54:1098-1105, 1996.

[23] A. R. Calderbank, E. M. Rains, N. J. A. Sloane, and P. W. Shor.
Quantum error correction and orthogonal geometry. Phys. Rev. Lett.,
78:405-409, 1997.

[24] A. R. Calderbank, E. M. Rains, P. W. Shor, and N. J. A. Sloane.
Quantum error correction via codes over GF(4). IEEE Transactions on
Information Theory, 44:1369-1387, 1998.



